Distance estimation in numerical data sets with missing values
Similar resources
Distance estimation in numerical data sets with missing values
The possibility of missing or incomplete data is often ignored when describing statistical or machine learning methods, but as it is a common problem in practice, it is relevant to consider. A popular strategy is to fill in the missing values by imputation as a pre-processing step, but for many methods this is not necessary, and can yield sub-optimal results. Instead, appropriately...
Handling Missing Attribute Values in Preterm Birth Data Sets
The objective of our research was to find the best approach to handle missing attribute values in data sets describing preterm birth provided by Duke University. Five strategies were used for filling in missing attribute values, based on most common values and closest fit for symbolic attributes, averages for numerical attributes, and a special approach to induce only certain rules from spe...
Regression imputation of missing values in longitudinal data sets.
A stand-alone, menu-driven PC program, written in GAUSS, which can be used to estimate missing observations in longitudinal data sets is described and made available to interested readers. The program is limited to the situation in which we have complete data on N cases at each of the planned times of measurement t1, t2,..., tT; and we wish to use this information, together with the non-missing...
Distance graphs with missing multiples in the distance sets
Given positive integers $m$, $k$ and $s$ with $m > ks$, let $D_{m,k,s}$ represent the set $\{1, 2, \dots, m\} \setminus \{k, 2k, \dots, sk\}$. The distance graph $G(\mathbb{Z}, D_{m,k,s})$ has as vertex set all integers $\mathbb{Z}$ and edges connecting $i$ and $j$ whenever $|i - j| \in D_{m,k,s}$. The chromatic number and the fractional chromatic number of $G(\mathbb{Z}, D_{m,k,s})$ are denoted by $\chi(\mathbb{Z}, D_{m,k,s})$ and $\chi_f(\mathbb{Z}, D_{m,k,s})$, respectively. For $s = 1$, $\chi(\mathbb{Z}, D_{m,k,1})$ was s...
Mixture of Gaussians for distance estimation with missing data
Many data sets have missing values in practical application contexts, but the majority of commonly studied machine learning methods cannot be applied directly when there are incomplete samples. However, most such methods only depend on the relative differences between samples instead of their particular values, and thus one useful approach is to directly estimate the pairwise distances between ...
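As a rough illustration of estimating a pairwise distance directly instead of imputing first, the sketch below computes the expected squared Euclidean distance between two samples whose missing entries are marked `None`. It is not the method of the paper above: it assumes a much simpler model in which every feature is independent and each missing value is described only by that feature's mean and variance over the observed entries. The function names `column_stats` and `expected_sq_distance` are hypothetical.

```python
def column_stats(data):
    """Per-feature mean and population variance over observed (non-None) entries.

    Assumes every feature has at least one observed value.
    """
    n_features = len(data[0])
    means, variances = [], []
    for j in range(n_features):
        observed = [row[j] for row in data if row[j] is not None]
        mu = sum(observed) / len(observed)
        var = sum((v - mu) ** 2 for v in observed) / len(observed)
        means.append(mu)
        variances.append(var)
    return means, variances


def expected_sq_distance(x, y, means, variances):
    """Expected squared Euclidean distance between x and y.

    An observed entry contributes its value with zero variance; a missing
    entry contributes the feature mean and the feature variance. For
    independent X and Y, E[(X - Y)^2] = (E[X] - E[Y])^2 + Var[X] + Var[Y].
    """
    total = 0.0
    for xj, yj, mu, var in zip(x, y, means, variances):
        mx, vx = (xj, 0.0) if xj is not None else (mu, var)
        my, vy = (yj, 0.0) if yj is not None else (mu, var)
        total += (mx - my) ** 2 + vx + vy
    return total


# Toy data: the second sample is missing its second feature.
data = [[1.0, 2.0], [3.0, None], [5.0, 6.0]]
means, variances = column_stats(data)
d01 = expected_sq_distance(data[0], data[1], means, variances)
```

The variance terms are what separate this from plain mean imputation: substituting the mean and then measuring the distance would drop them and so underestimate the expected squared distance whenever a value is missing.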
Journal
Journal title: Information Sciences
Year: 2013
ISSN: 0020-0255
DOI: 10.1016/j.ins.2013.03.043